Skip to content

Conversation

scottmcm
Copy link
Member

In optimized builds GVN gets rid of these already, but in opt-level=0 we actually make allocas for this, which particularly impacts ?-style things that use actually-only-one-variant types like this.

While doing so, rewrite LocalAnalyzer::process_place to be non-recursive, solving a 6+ year old FIXME.

r? codegen

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 23, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jul 23, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Comment on lines +10 to +31
#[no_mangle]
pub fn make_unmake_result_never(x: i32) -> i32 {
// CHECK-LABEL: define i32 @make_unmake_result_never(i32 %x)
// CHECK: start:
// CHECK-NEXT: ret i32 %x

let y: Result<i32, Never> = Ok(x);
let Ok(z) = y;
z
}

#[no_mangle]
pub fn extract_control_flow_never(x: ControlFlow<&str, Never>) -> &str {
// CHECK-LABEL: define { ptr, i64 } @extract_control_flow_never(ptr align 1 %x.0, i64 %x.1)
// CHECK: start:
// CHECK-NEXT: %[[P0:.+]] = insertvalue { ptr, i64 } poison, ptr %x.0, 0
// CHECK-NEXT: %[[P1:.+]] = insertvalue { ptr, i64 } %[[P0]], i64 %x.1, 1
// CHECK-NEXT: ret { ptr, i64 } %[[P1]]

let ControlFlow::Break(s) = x;
s
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare today: https://rust.godbolt.org/z/9dezqPq6n

define i32 @make_unmake_result_never(i32 %x) unnamed_addr {
start:
  %y = alloca [4 x i8], align 4
  store i32 %x, ptr %y, align 4
  %z = load i32, ptr %y, align 4
  ret i32 %z
}

define { ptr, i64 } @extract_control_flow_never(ptr align 1 %0, i64 %1) unnamed_addr {
start:
  %x = alloca [16 x i8], align 8
  store ptr %0, ptr %x, align 8
  %2 = getelementptr inbounds i8, ptr %x, i64 8
  store i64 %1, ptr %2, align 8
  %s.0 = load ptr, ptr %x, align 8
  %3 = getelementptr inbounds i8, ptr %x, i64 8
  %s.1 = load i64, ptr %3, align 8
  %4 = insertvalue { ptr, i64 } poison, ptr %s.0, 0
  %5 = insertvalue { ptr, i64 } %4, i64 %s.1, 1
  ret { ptr, i64 } %5
}

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

In optimized builds GVN gets rid of these already, but in `opt-level=0` we actually make `alloca`s for this, which particularly impacts `?`-style things that use actually-only-one-variant types like this.
@scottmcm
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors
Copy link

rust-bors bot commented Jul 23, 2025

⌛ Trying commit 6a5c7e0 with merge 88eefbd

To cancel the try build, run the command @bors try cancel.

rust-bors bot added a commit that referenced this pull request Jul 23, 2025
No longer need `alloca`s for consuming `Result<!, i32>` and similar
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 23, 2025
@rust-bors
Copy link

rust-bors bot commented Jul 23, 2025

☀️ Try build successful (CI)
Build commit: 88eefbd (88eefbd05a766727d674a8f4a54e38670ce81d3b, parent: a7a1618e6c835f1f00940ad72203d05808209a0d)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (88eefbd): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -5.4%, secondary -2.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-5.4% [-5.4%, -5.4%] 1
Improvements ✅
(secondary)
-2.6% [-2.6%, -2.6%] 1
All ❌✅ (primary) -5.4% [-5.4%, -5.4%] 1

Cycles

Results (primary -0.4%, secondary -0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.3% [2.3%, 2.3%] 1
Regressions ❌
(secondary)
3.3% [2.6%, 4.1%] 3
Improvements ✅
(primary)
-3.2% [-3.2%, -3.2%] 1
Improvements ✅
(secondary)
-4.4% [-7.2%, -2.8%] 3
All ❌✅ (primary) -0.4% [-3.2%, 2.3%] 2

Binary size

Results (primary 0.0%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.0%] 4
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 1
Improvements ✅
(primary)
-0.0% [-0.0%, -0.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.0% [-0.0%, 0.0%] 5

Bootstrap: 463.694s -> 465.155s (0.32%)
Artifact size: 374.58 MiB -> 374.70 MiB (0.03%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 23, 2025
@scottmcm
Copy link
Member Author

Those html5ever binary size regressions are 336 bytes, 328 bytes, 328 bytes, and 328 bytes, which I'm not worried about.

Thus I would consider this very perf-neutral and thus good to go.

(I was hoping for a bit better than that, but for neutral I think it's fine, and we can try things like rustc_always_inline on Result::branch and such in a future PR to make this more applicable. Similarly this restructuring will hopefully make it easier to pull in some changes from #138582 and thus make it applicable to even more things later as well.)

With no icount changes I'll demote it to
@bors rollup=iffy
but feel free to pick a different level when reviewing if you feel differently.

@WaffleLapkin
Copy link
Member

@bors r+

@bors
Copy link
Collaborator

bors commented Jul 25, 2025

📌 Commit 6a5c7e0 has been approved by WaffleLapkin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 25, 2025
@bors
Copy link
Collaborator

bors commented Jul 27, 2025

⌛ Testing commit 6a5c7e0 with merge 86ef320...

@bors
Copy link
Collaborator

bors commented Jul 27, 2025

☀️ Test successful - checks-actions
Approved by: WaffleLapkin
Pushing 86ef320 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 27, 2025
@bors bors merged commit 86ef320 into rust-lang:master Jul 27, 2025
13 checks passed
@rustbot rustbot added this to the 1.90.0 milestone Jul 27, 2025
Copy link
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 052114f (parent) -> 86ef320 (this PR)

Test differences

Show 7 test diffs

Stage 1

  • [codegen] tests/codegen-llvm/enum/enum-transparent-extract.rs: [missing] -> pass (J2)

Stage 2

  • [codegen] tests/codegen-llvm/enum/enum-transparent-extract.rs: [missing] -> pass (J0)
  • [codegen] tests/codegen-llvm/enum/enum-transparent-extract.rs: [missing] -> ignore (only executed when the pointer width is 64bit) (J1)

Additionally, 4 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 86ef32029427cfc4161a3fd7a51992302f7c5552 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-aarch64-linux: 5818.3s -> 7981.8s (37.2%)
  2. x86_64-apple-1: 7188.0s -> 9432.7s (31.2%)
  3. aarch64-apple: 5664.2s -> 4750.2s (-16.1%)
  4. x86_64-apple-2: 6009.0s -> 5349.6s (-11.0%)
  5. pr-check-1: 1666.6s -> 1530.5s (-8.2%)
  6. dist-aarch64-apple: 5778.0s -> 6225.2s (7.7%)
  7. x86_64-msvc-ext3: 5985.4s -> 6429.8s (7.4%)
  8. x86_64-gnu-llvm-19-1: 3537.6s -> 3292.0s (-6.9%)
  9. dist-ohos-armv7: 4172.3s -> 3894.7s (-6.7%)
  10. dist-s390x-linux: 4811.2s -> 5085.5s (5.7%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (86ef320): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (secondary 3.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
4.4% [2.0%, 8.0%] 7
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.5% [-1.5%, -1.5%] 1
All ❌✅ (primary) - - 0

Cycles

Results (primary -1.1%, secondary -5.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
1.6% [1.6%, 1.6%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.5% [-2.8%, -2.2%] 2
Improvements ✅
(secondary)
-5.2% [-9.7%, -2.5%] 15
All ❌✅ (primary) -1.1% [-2.8%, 1.6%] 3

Binary size

Results (primary 0.0%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.0%] 4
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 1
Improvements ✅
(primary)
-0.0% [-0.0%, -0.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.0% [-0.0%, 0.0%] 5

Bootstrap: 467.835s -> 467.671s (-0.04%)
Artifact size: 376.68 MiB -> 376.75 MiB (0.02%)

@scottmcm
Copy link
Member Author

(Those cycle changes look fake in the couple I checked.)

@scottmcm scottmcm deleted the ssa-enums-v0 branch July 27, 2025 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants